Abstract


Bounds are established for log cross-product ratios (log odds ratios) involving pairs of 
items for item response models. First, expressions for bounds on log cross-product ratios 
are provided for unidimensional item response models in general. Then, explicit bounds are 
obtained for the Rasch model and the two-parameter logistic (2PL) model. Results are also 
illustrated through an example from a study of model-checking procedures. The bounds 
obtained can provide a basis for assessment of goodness of fit of these models. 
1 Introduction


Latent-variable models for item responses have strong implications for customary 
descriptive measures from contingency table analysis such as log cross-product ratios 
(Section 2). In the case of the Rasch model, this issue has been suggested by simulation 
studies intended to explore model diagnostics for the Rasch model (Sinharay & Johnson, 
2003). These studies indicated that the log cross-product ratios predicted by the Rasch 
model showed remarkably little variability among different pairs of items. This note seeks
to explain the observed simulation results and to provide general bounds for log
cross-product ratios for some familiar item response models. These bounds have importance
in derivation of starting values for algorithms for item response analysis and for checking 
of models. Section 2 provides the required theoretical results. Section 3 illustrates their 
application to an example concerned with model checking. Section 4 considers application 
of results to point-biserial correlations and tables of log cross-product ratios. 


2 Theoretical Results 

The desired bounds for log cross-product ratios can be obtained without loss of 
generality by study of just two items because all items are conditionally independent given 
the ability parameters of an item response model. General expressions for log cross-product 
ratios will be provided for general one-dimensional item response models. Results will be 
illustrated by use of the Rasch and 2PL models. 

Consider the relationship between two item responses $X_1$ and $X_2$. Let each $X_j$ be a
random variable with values 0 or 1, let $\mathbf{X} = (X_1, X_2)$, and let
$p(\mathbf{x}) = p(x_1, x_2)$ be the probability that $\mathbf{X} = \mathbf{x} = (x_1, x_2)$
for a two-dimensional vector $\mathbf{x}$ with coordinates 0 or 1. In the analysis of
contingency tables, the log cross-product ratio (also known as log odds ratio) of $X_1$
and $X_2$ is

$$\gamma = \log \frac{p(1,1)\,p(0,0)}{p(1,0)\,p(0,1)}. \tag{1}$$



The coefficient $\gamma$ is positive if, and only if, $X_1$ and $X_2$ are positively
correlated, for the correlation of $X_1$ and $X_2$ is

$$\rho = \frac{p(1,1)\,p(0,0) - p(1,0)\,p(0,1)}{[p_1(0)\,p_1(1)\,p_2(0)\,p_2(1)]^{1/2}},$$

where $p_j(x)$ is the probability that $X_j = x$ (Bishop, Fienberg, & Holland, 1975, p. 381).

In general, the formula for the correlation implies that $\rho$ is between
$1 - \exp(-\gamma)$ and $\exp(\gamma) - 1$.
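
As a small illustration of the two descriptive measures just defined, the following sketch computes the log cross-product ratio and the correlation from a 2 x 2 table of joint probabilities. The function names and the probabilities in `table` are hypothetical and chosen only for illustration; they do not come from the study discussed here.

```python
import math


def log_cross_product_ratio(p11, p10, p01, p00):
    """Log odds ratio gamma of a 2 x 2 table of joint probabilities, as in (1)."""
    return math.log(p11 * p00 / (p10 * p01))


def correlation(p11, p10, p01, p00):
    """Pearson correlation rho of two binary variables with the given joint table."""
    p1 = p11 + p10  # marginal probability that X1 = 1
    p2 = p11 + p01  # marginal probability that X2 = 1
    numerator = p11 * p00 - p10 * p01
    denominator = math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    return numerator / denominator


# Hypothetical joint probabilities (they sum to 1).
table = dict(p11=0.4, p10=0.1, p01=0.2, p00=0.3)
gamma = log_cross_product_ratio(**table)
rho = correlation(**table)
print(gamma, rho)
```

Both measures share the numerator sign of $p(1,1)p(0,0) - p(1,0)p(0,1)$, which is why $\gamma$ and $\rho$ are positive together.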

In this note, the implications of item response models on the parameter $\gamma$ are
considered for models with a one-dimensional ability variable. One basic result, Theorem 1,
is that $\gamma$ must be positive for an item response model with an ability distribution
not concentrated at a single point and with monotone increasing item characteristic curves
for the items. Theorems 2 and 3 provide bounds on $\gamma$ for the Rasch model. Similar
bounds are also considered for the 2PL model. The bounds are especially easy to apply if
the ability distribution is normal.

To define the implications of an item response model on the log cross-product ratio
$\gamma$, let the ability variable $\theta$ be a real random variable with distribution
function $F$, let the $X_j$ be conditionally independent given $\theta$, let
$P_j(\theta) > 0$ be the probability that $X_j = 1$ given $\theta$, let
$Q_j(\theta) = 1 - P_j(\theta) > 0$ be the corresponding probability that $X_j = 0$ given
$\theta$, and let

$$\lambda_j(\theta) = \log \frac{P_j(\theta)}{Q_j(\theta)}$$

be the item logit function (ILF) of $X_j$. Then the probability $p(\mathbf{x})$ satisfies

$$p(\mathbf{x}) = \int \prod_{j=1}^{2} P_j^{x_j} Q_j^{1-x_j} \, dF. \tag{2}$$



The log cross-product ratio can be expressed in terms of the cumulant generating
function of the item logit functions $\lambda_j$ conditional on $X_1$ and $X_2$ both being 0.
To verify this claim, let $\boldsymbol{\lambda}$ be the vector of $\lambda_j$,
$1 \le j \le 2$, and let $\mathbf{x}'\boldsymbol{\lambda}$ be $\sum_{j=1}^{2} x_j \lambda_j$.
Let $V$ be $Q_1 Q_2$. Let $t_1$ and $t_2$ be real numbers, and let $\mathbf{t}$ be the
two-dimensional vector with coordinates $t_1$ and $t_2$. Let $\mathbf{0} = (0,0)$. Let

$$M(\mathbf{t}) = M(t_1, t_2) = [p(\mathbf{0})]^{-1} \int \exp(\mathbf{t}'\boldsymbol{\lambda})\, V \, dF \tag{3}$$

be the moment generating function at $\mathbf{t}$ of $\boldsymbol{\lambda}(\theta)$ given
$\mathbf{X} = \mathbf{0}$, and let $C(\mathbf{t}) = C(t_1, t_2) = \log M(\mathbf{t})$ be the
conditional cumulant generating function of $\boldsymbol{\lambda}(\theta)$ at $\mathbf{t}$
given that $\mathbf{X} = \mathbf{0}$. The Dutch identity (Holland, 1990) states that

$$p(\mathbf{x}) = p(\mathbf{0}) M(\mathbf{x}). \tag{4}$$

By (4),

$$\gamma = C(1,1) - C(1,0) - C(0,1) + C(0,0). \tag{5}$$

Thus the log cross-product ratio can be expressed in terms of the cumulant generating
function $C$.
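
The identities (4) and (5) can be checked numerically. The sketch below assumes a Rasch model with a standard normal ability distribution and hypothetical item difficulties `BETA`; the trapezoidal quadrature and all function names are illustrative choices, not part of the original analysis.

```python
import math

BETA = (-0.5, 1.0)  # hypothetical item difficulties, not from the source


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def normal_pdf(theta):
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)


def integrate(fn, lo=-10.0, hi=10.0, n=2000):
    """Trapezoidal rule; the integrands used here are negligible outside [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (fn(lo) + fn(hi))
    for k in range(1, n):
        total += fn(lo + k * h)
    return total * h


def lam(j, theta):
    """Rasch item logit function lambda_j(theta) = theta - beta_j."""
    return theta - BETA[j]


def p(x):
    """Joint probability p(x) from equation (2)."""
    def integrand(theta):
        value = normal_pdf(theta)
        for j in range(2):
            pj = logistic(lam(j, theta))
            value *= pj if x[j] == 1 else 1.0 - pj
        return value
    return integrate(integrand)


def M(t):
    """Conditional moment generating function from equation (3)."""
    def integrand(theta):
        v = (1.0 - logistic(lam(0, theta))) * (1.0 - logistic(lam(1, theta)))
        tilt = math.exp(t[0] * lam(0, theta) + t[1] * lam(1, theta))
        return tilt * v * normal_pdf(theta)
    return integrate(integrand) / p((0, 0))


def C(t):
    return math.log(M(t))


gamma_direct = math.log(p((1, 1)) * p((0, 0)) / (p((1, 0)) * p((0, 1))))
gamma_cgf = C((1, 1)) - C((1, 0)) - C((0, 1)) + C((0, 0))
print(gamma_direct, gamma_cgf)
```

The two printed values should agree up to quadrature and rounding error, because $P_j = \exp(\lambda_j) Q_j$ turns the integrand of (2) into the integrand of (3).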

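The cumulant generating function $C$ is closely related to the conditional cumulants
of $\boldsymbol{\lambda}(\theta)$ given $\mathbf{X} = \mathbf{0}$. Let $I$ be the set of
integer pairs $\mathbf{i} = (i_1, i_2)$ such that $i_1$ and $i_2$ are nonnegative and at
least one of $i_1$ and $i_2$ is positive, and let $J$ be the set of $\mathbf{i}$ in $I$ with
both $i_1$ and $i_2$ positive. Provided that, for some real $r_M > 0$, $M(\mathbf{t})$ is
finite whenever $|\mathbf{t}|^2 = t_1^2 + t_2^2 < r_M^2$, there exists an $r_C > 0$ such
that, for $|\mathbf{t}| < r_C$,

$$C(\mathbf{t}) = \sum_{\mathbf{i} \in I} \kappa_{\mathbf{i}} \frac{t_1^{i_1} t_2^{i_2}}{i_1!\, i_2!}.$$

The coefficient $\kappa_{\mathbf{i}} = \kappa_{i_1 i_2}$ is the conditional product cumulant
of $\boldsymbol{\lambda}(\theta)$ given $\mathbf{X} = \mathbf{0}$ corresponding to the
conditional expectation of $\lambda_1^{i_1}(\theta) \lambda_2^{i_2}(\theta)$ given
$\mathbf{X} = \mathbf{0}$. For example, $\kappa_{10}$ is the conditional mean of
$\lambda_1(\theta)$, $\kappa_{20}$ is the conditional variance of $\lambda_1(\theta)$, and
$\kappa_{11}$ is the conditional covariance of $\lambda_1(\theta)$ and $\lambda_2(\theta)$.
If $r_C > 2^{1/2}$, then $\gamma$ has the power series expansion

$$\gamma = \sum_{\mathbf{i} \in J} \frac{\kappa_{\mathbf{i}}}{i_1!\, i_2!}.$$

The expansion suggests a crude approximation of $\gamma$ by the conditional covariance
$\kappa_{11}$ of $\lambda_1(\theta)$ and $\lambda_2(\theta)$, with a more refined
approximation by $\kappa_{11} + (\kappa_{21} + \kappa_{12})/2$. If the conditional
distribution of $\boldsymbol{\lambda}(\theta)$ given $X_1 = X_2 = 0$ is bivariate normal,
then $\gamma$ is exactly equal to $\kappa_{11}$. For any distribution of $\theta$, if
$\lambda_1(\theta)$ or $\lambda_2(\theta)$ is constant, so that $X_1$ or $X_2$ is
independent of $\theta$, then $\gamma = \kappa_{11} = 0$.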


A more general expression for $\gamma$ can be derived by consideration of a new random
vector based on $\boldsymbol{\lambda}$. This result is always available. Standard convexity
properties of moment generating functions imply that $M(\mathbf{t})$ is finite for
$0 \le t_j \le 1$, $1 \le j \le 2$. As a consequence, if $0 < t_j < 1$ for $1 \le j \le 2$
and $A(\mathbf{t})$ is a random variable with distribution function

$$[M(\mathbf{t})\, p(\mathbf{0})]^{-1} \int_{-\infty}^{x} \exp(\mathbf{t}'\boldsymbol{\lambda})\, V \, dF$$

at $x$ real, then the two-dimensional random vector $\boldsymbol{\lambda}(\mathbf{t})$ with
coordinates $\lambda_j(\mathbf{t}) = \lambda_j(A(\mathbf{t}))$ for $1 \le j \le 2$ has finite
moments of all orders. In particular, the expectation $\kappa_j(\mathbf{t})$ of
$\lambda_j(\mathbf{t})$, $1 \le j \le 2$, and the covariance $\tau(\mathbf{t})$ of
$\lambda_1(\mathbf{t})$ and $\lambda_2(\mathbf{t})$ are defined and finite. The moment
generating function of $\boldsymbol{\lambda}(\mathbf{t})$ is
$M(\mathbf{t} + \mathbf{u})/M(\mathbf{t})$ at $\mathbf{u}$ if $0 < u_j + t_j < 1$ for
$1 \le j \le 2$. To explore required integrals derived from $C$, let $\mathbf{T}$ be
uniformly distributed on the unit square $S$ of $\mathbf{t}$ with $0 < t_j < 1$ for
$1 \le j \le 2$.

Use of the mean value theorem of calculus shows that

$$\gamma = E(\tau(\mathbf{T}))$$

is the expected conditional covariance of $\lambda_1(\mathbf{T})$ and $\lambda_2(\mathbf{T})$
given $\mathbf{T}$.
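
The representation of $\gamma$ as an average of $\tau(\mathbf{t})$ over the unit square can be illustrated numerically. The following sketch assumes a Rasch model with standard normal ability and hypothetical difficulties `BETA`; it replaces the expectation over $\mathbf{T}$ by a midpoint rule on a 10 x 10 grid, so the agreement with (5) is only approximate.

```python
import math

BETA = (0.0, 1.0)  # hypothetical Rasch item difficulties
GRID = [-10.0 + 0.05 * k for k in range(401)]  # uniform theta grid for quadrature


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def tilted_weights(t):
    """Unnormalized density of A(t) on GRID: exp(t'lambda) * Q1 * Q2 * f."""
    weights = []
    for theta in GRID:
        l1, l2 = theta - BETA[0], theta - BETA[1]
        v = (1.0 - logistic(l1)) * (1.0 - logistic(l2))
        f = math.exp(-0.5 * theta * theta)  # N(0, 1) kernel; constants cancel
        weights.append(math.exp(t[0] * l1 + t[1] * l2) * v * f)
    return weights


def tau(t):
    """Covariance of lambda_1(A(t)) and lambda_2(A(t)); a variance under Rasch."""
    w = tilted_weights(t)
    total = sum(w)
    mean = sum(wi * th for wi, th in zip(w, GRID)) / total
    return sum(wi * (th - mean) ** 2 for wi, th in zip(w, GRID)) / total


def log_mass(t):
    """C(t) up to an additive constant that cancels in equation (5)."""
    return math.log(sum(tilted_weights(t)))


gamma_cgf = (log_mass((1, 1)) - log_mass((1, 0))
             - log_mass((0, 1)) + log_mass((0, 0)))

n = 10  # midpoint rule on an n x n grid over the unit square S
midpoints = [(k + 0.5) / n for k in range(n)]
gamma_avg = sum(tau((t1, t2)) for t1 in midpoints for t2 in midpoints) / n ** 2
print(round(gamma_cgf, 4), round(gamma_avg, 4))
```

The midpoint rule is used only because it is short to write; any quadrature rule over $S$ would serve the same purpose.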

The following theorem provides a simple condition that ensures that the log cross-product
ratio $\gamma$ is positive. It is already known that $\gamma \ge 0$ (Holland & Rosenbaum, 1986).

Theorem 1 Let $\lambda_1$ and $\lambda_2$ be monotone increasing functions. Assume that no
constant $c$ exists such that $\theta$ is $c$ with a probability of 1. Then $\gamma$ is positive.

Proof. It suffices to show that $\tau(\mathbf{t})$ is positive. To verify that
$\tau(\mathbf{t})$ is positive, let $A'(\mathbf{t})$ be a random variable independent of
$A(\mathbf{t})$ with the same distribution as $A(\mathbf{t})$. Then $2\tau(\mathbf{t})$
is the expected value of the product

$$U = [\lambda_1(A(\mathbf{t})) - \lambda_1(A'(\mathbf{t}))][\lambda_2(A(\mathbf{t})) - \lambda_2(A'(\mathbf{t}))].$$

If $\lambda_1$ and $\lambda_2$ are monotone increasing, then $U$ is nonnegative, and the
probability is positive that $U$ is positive. ∎

In the special case of a Rasch model with item difficulties $\beta_j$ for $j$ from 1 to 2,
$\lambda_j(\theta) = \theta - \beta_j$, so that $\tau(\mathbf{t})$ reduces to the variance of
$A(\mathbf{t})$, and $\gamma$ is the expected conditional variance of $A(\mathbf{T})$ given
$\mathbf{T}$. If the conditional moment generating function of $\theta$ given
$\mathbf{X} = \mathbf{0}$ is finite in an open interval that includes 0, if $\kappa_i$
denotes the $i$th conditional cumulant of $\theta$ given $\mathbf{X} = \mathbf{0}$, and if
the conditional cumulant generating function $K$ of $\theta$ given $\mathbf{X} = \mathbf{0}$
satisfies $K(t)$ is finite and

$$K(t) = \sum_{i=1}^{\infty} \kappa_i t^i / i!$$

for $|t| < r_K$ for some $r_K > 2^{1/2}$, then

$$\gamma = \sum_{i=2}^{\infty} (2^i - 2) \kappa_i / i!.$$

If the conditional distribution of $\theta$ given $\mathbf{X} = \mathbf{0}$ is a normal
distribution with variance $\sigma_0^2$, then $\gamma = \sigma_0^2$.
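
The series for the Rasch case can be traced back to (5) in a few lines. The following worked derivation is a sketch that uses only the definitions given so far; it assumes the series for $K$ may be evaluated at $t = 1$ and $t = 2$.

```latex
% Under the Rasch model, t'lambda(theta) = (t_1 + t_2) theta - t_1 beta_1 - t_2 beta_2, so
\[
  C(t_1, t_2) = K(t_1 + t_2) - t_1 \beta_1 - t_2 \beta_2 .
\]
% Substituting into (5), the terms that are linear in beta cancel:
\[
  \gamma = C(1,1) - C(1,0) - C(0,1) + C(0,0) = K(2) - 2K(1) + K(0).
\]
% Because K(0) = 0 and K(t) = \sum_{i \ge 1} \kappa_i t^i / i!,
\[
  \gamma = \sum_{i=1}^{\infty} \frac{(2^i - 2)\,\kappa_i}{i!}
         = \sum_{i=2}^{\infty} \frac{(2^i - 2)\,\kappa_i}{i!},
\]
% since the i = 1 term vanishes. For a normal conditional distribution,
% \kappa_i = 0 for i \ge 3, which leaves \gamma = (4 - 2)\kappa_2 / 2 = \kappa_2.
```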

If $\theta$ has a continuous distribution with a positive density $f$ that is twice
differentiable, then a lower bound on $\gamma$ may be obtained as in the following theorem.


Theorem 2 Let the Rasch model hold, and let $\theta$ have continuous positive and twice
differentiable density $f$. Assume that real $\delta > 0$ and real $c > 0$ exist such that the
derivative $f_1$ of $f$ satisfies the condition that

$$|f_1(z + a)| \le c f(z), \quad -\infty < z < \infty, \quad |a| < \delta.$$

Let $g = \log f$, let $g_1$ be the derivative of $g$, and let $g_2$ be the second derivative
of $g$. Let $-g_2(A(\mathbf{t}))$ have a finite positive expectation $\eta(\mathbf{t})$ for
each $\mathbf{t}$ in $S$. Let $\eta_j(\mathbf{t})$ be the expectation of
$P_j(A(\mathbf{t})) Q_j(A(\mathbf{t}))$ for $j$ equal 1 or 2. Then

$$\gamma \ge E([\eta(\mathbf{T}) + \eta_1(\mathbf{T}) + \eta_2(\mathbf{T})]^{-1}).$$


Proof. For $\mathbf{t}$ in $S$, the density of $A(\mathbf{t})$ is

$$h = H^{-1} f \prod_{j=1}^{2} P_j^{t_j} Q_j^{1-t_j},$$

where

$$H = \int f \prod_{j=1}^{2} P_j^{t_j} Q_j^{1-t_j},$$

so that $e = \log h$ has first derivative

$$e_1 = g_1 + \sum_{j=1}^{2} (t_j - P_j)$$

and second derivative

$$e_2 = g_2 - \sum_{j=1}^{2} P_j Q_j.$$

The first derivative of $h$ is

$$h_1 = e_1 h.$$

Consider estimation of the expectation of a random variable $Z$ under the model that
$Z - a$ has the distribution of $A(\mathbf{t})$ for some real $a$ with $|a| < \delta$. Apply
the Cramér-Rao inequality (Cramér, 1946, p. 475). Then elementary calculations show that

$$\tau(\mathbf{t}) \ge [\eta(\mathbf{t}) + \eta_1(\mathbf{t}) + \eta_2(\mathbf{t})]^{-1}.$$

The conclusion follows. ∎

In many cases, an upper bound on $\gamma$ may be established as in the following theorem.


Theorem 3 Let the Rasch model hold, and let $\theta$ have continuous positive and twice
differentiable density $f$. Let $g = \log f$, let $g_1$ be the derivative of $g$, and let
$g_2$ be the second derivative of $g$. For some constant $b > 0$, let $g_2 \le -b$. Then
$\gamma < 1/b$.


Proof. From the proof of Theorem 2, it follows that $e_2 < -b$. As a consequence, $\log h$
must achieve a maximum at some real $z$.

For any random variable $Y$ with mean $\mu$ and variance $\sigma^2$,

$$E([Y - z]^2) = \sigma^2 + (z - \mu)^2.$$

Thus the variance $\tau(\mathbf{t})$ of $A(\mathbf{t})$ does not exceed the expected value of
$[A(\mathbf{t}) - z]^2$, so that

$$\tau(\mathbf{t}) \le \int (uvw),$$

where, for $a$ real,

$$u(a) = (a - z)^2,$$

$$v(a) = h(a)/w(a),$$

and

$$w(a) = (2\pi/b)^{-1/2} \exp[-b\,u(a)/2].$$

Because $h$ and $w$ are density functions,

$$\int w = 1$$

and

$$\int (vw) = 1.$$

Because $w$ is the density function of the normal distribution with mean $z$ and variance
$b^{-1}$,

$$\int (uw) = b^{-1}.$$

Use of standard formulas for changes of variables yields

$$\int (uvw) = \int_0^{\infty} (u^* v^* w^*),$$

$$\int_0^{\infty} (v^* w^*) = 1,$$

$$\int_0^{\infty} w^* = 1,$$

and

$$\int_0^{\infty} (u^* w^*) = b^{-1},$$

where, for $a > 0$,

$$u^*(a) = a,$$

$$v^*(a) = [v(z + a^{1/2}) + v(z - a^{1/2})]/2,$$

and

$$w^*(a) = (2a\pi/b)^{-1/2} \exp(-ba/2)$$

(Cramér, 1946, p. 168).

It follows that, for any real $d$,

$$\int_0^{\infty} (u^* - b^{-1})(v^* - d)\, w^* = \int_0^{\infty} (u^* v^* w^*) - b^{-1}.$$

The definition of $z$ and the assumptions of the theorem imply that $v^*$ is a decreasing
function, so that the choice of $d = v^*(b^{-1})$ implies that
$y = (u^* - b^{-1})(v^* - d)$ is negative except at $b^{-1}$, and $y(b^{-1}) = 0$. Thus
$\tau(\mathbf{t})$ is less than $b^{-1}$. It follows that $\gamma$ is less than $b^{-1}$. ∎

In the important special case of $\theta$ with a normal distribution with mean $\mu$ and
variance $\sigma^2$, $g_2$ in Theorem 2 is the constant $-1/\sigma^2$, so that
$\eta(\mathbf{t}) = 1/\sigma^2$. Because $P_j Q_j \le 1/4$, each $\eta_j(\mathbf{t})$ is at
most $1/4$, so that $\tau(\mathbf{t})$ and $\gamma$ are both at least
$2\sigma^2/(2 + \sigma^2)$. On the other hand, it also follows from Theorem 3 that $\gamma$
is less than $\sigma^2$. So, for example, for $\sigma^2 = 1$, the lower and upper bounds on
$\gamma$ are 2/3 and 1, respectively. For fixed $\mu$ and $\sigma^2$, if $|\beta_j|$
approaches $\infty$ for $j$ equals 1 and 2, then $\gamma$ approaches $\sigma^2$. The
proof of this claim is an application of Scheffé's theorem (Scheffé, 1947). For example, if
$\beta_1$ and $\beta_2$ both approach $-\infty$, then multiplication of the numerator and
denominator by

$$\exp[-(1 - t_1)\beta_1 - (1 - t_2)\beta_2]$$

shows that, in the proof of Theorem 2, $h$ converges to the density of a normal random
variable with mean $\mu - [(1 - t_1) + (1 - t_2)]\sigma^2$ and variance $\sigma^2$. It follows
that $\eta_j(\mathbf{t})$ converges to 0, so that the lower bound for $\gamma$ converges to
$\sigma^2$. Because $\sigma^2$ is also the upper bound for $\gamma$, $\gamma$ converges to
$\sigma^2$. Minor variations on the same argument apply if some $\beta_j$ approaches $\infty$.
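
The bounds for the normal case can be compared with values of $\gamma$ obtained by direct numerical integration of (1) and (2). The sketch below is illustrative only: the function names, the quadrature grid, and the difficulty pairs are assumptions made for this example.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def rasch_normal_bounds(sigma2):
    """Lower and upper bounds on gamma for Rasch items with N(mu, sigma2) ability."""
    return 2.0 * sigma2 / (2.0 + sigma2), sigma2


def rasch_gamma(beta1, beta2, mu=0.0, sigma=1.0, step=0.05, width=10.0):
    """gamma from equations (1) and (2) by quadrature on a uniform grid."""
    n = int(round(2.0 * width / step))
    p11 = p10 = p01 = p00 = 0.0
    for k in range(n + 1):
        z = -width + k * step          # standardized ability
        theta = mu + sigma * z
        f = math.exp(-0.5 * z * z)     # normal kernel; constants cancel in the ratio
        p1 = logistic(theta - beta1)
        p2 = logistic(theta - beta2)
        p11 += f * p1 * p2
        p10 += f * p1 * (1.0 - p2)
        p01 += f * (1.0 - p1) * p2
        p00 += f * (1.0 - p1) * (1.0 - p2)
    return math.log(p11 * p00 / (p10 * p01))


lower, upper = rasch_normal_bounds(1.0)
for betas in [(0.0, 0.0), (-1.0, 2.0), (-4.0, -4.0)]:  # hypothetical difficulties
    print(betas, round(rasch_gamma(*betas), 3), (round(lower, 3), upper))
```

For each pair the computed value should lie strictly between the two bounds, and it should move toward the upper bound as the difficulties grow in magnitude.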

The arguments in Theorems 2 and 3 are readily applied to the 2PL model of item
response theory. In this case, $\lambda_j(\theta) = a_j(\theta - \beta_j)$ for an item
difficulty $\beta_j$ and an item discrimination $a_j > 0$. The definition of
$\eta(\mathbf{t})$ is changed due to the new definition of the $\lambda_j$; however, the
remaining changes are quite limited. In Theorem 2, the lower bound becomes

$$a_1 a_2 E([\eta(\mathbf{T}) + a_1^2 \eta_1(\mathbf{T}) + a_2^2 \eta_2(\mathbf{T})]^{-1}),$$

and the upper bound becomes $a_1 a_2 / b$ in Theorem 3. In the case of $\theta$ with a normal
distribution with mean $\mu$ and variance $\sigma^2$, a lower bound is

$$\frac{4 a_1 a_2 \sigma^2}{4 + (a_1^2 + a_2^2)\sigma^2}$$

and an upper bound is $a_1 a_2 \sigma^2$. The previous result for the Rasch model is obtained
with $a_1 = a_2 = 1$.
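
The 2PL bounds for normally distributed ability are simple enough to evaluate directly. The following sketch codes the two closed-form bounds and, as a rough check, compares them with a numerically integrated $\gamma$. The discriminations, difficulties, and function names are hypothetical and serve only as an illustration.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def twopl_normal_bounds(a1, a2, sigma2):
    """Lower and upper bounds on gamma for 2PL items with N(mu, sigma2) ability."""
    lower = 4.0 * a1 * a2 * sigma2 / (4.0 + (a1 * a1 + a2 * a2) * sigma2)
    upper = a1 * a2 * sigma2
    return lower, upper


def twopl_gamma(a, beta, mu=0.0, sigma=1.0, step=0.05, width=10.0):
    """gamma from equations (1) and (2) for two 2PL items, by simple quadrature."""
    n = int(round(2.0 * width / step))
    p11 = p10 = p01 = p00 = 0.0
    for k in range(n + 1):
        z = -width + k * step
        theta = mu + sigma * z
        f = math.exp(-0.5 * z * z)  # normal kernel; constants cancel in the ratio
        p1 = logistic(a[0] * (theta - beta[0]))
        p2 = logistic(a[1] * (theta - beta[1]))
        p11 += f * p1 * p2
        p10 += f * p1 * (1.0 - p2)
        p01 += f * (1.0 - p1) * p2
        p00 += f * (1.0 - p1) * (1.0 - p2)
    return math.log(p11 * p00 / (p10 * p01))


a = (0.8, 1.5)       # hypothetical discriminations
beta = (0.3, -0.7)   # hypothetical difficulties
lower, upper = twopl_normal_bounds(a[0], a[1], 1.0)
gamma = twopl_gamma(a, beta)
print(round(lower, 3), round(gamma, 3), round(upper, 3))
```

Setting both discriminations to 1 recovers the Rasch bounds $2\sigma^2/(2 + \sigma^2)$ and $\sigma^2$.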


3 Example 

In a study of assessment of fit of common models in item response theory (Sinharay &
Johnson, 2003), prediction of log cross-product ratios for item pairs was examined under
the Rasch model for $\mu = 0$ and $\sigma^2 = 1$. The authors reported that the predicted log
cross-product ratios (from the Rasch model) among the item pairs fall within a very narrow
range, all around 0.73 in their limited simulation study. These results are consistent with
the bounds of 2/3 and 1 established in Section 2. To corroborate their findings, numerical
integration was employed to compute $\gamma$ using (1) for this case (Rasch model with
$\mu = 0$ and $\sigma^2 = 1$) with $\beta_1$ and $\beta_2$ on a grid of integer values
between $-4$ and 4. The values of the log cross-product ratios are summarized in Table 1.
As evident in Table 1, the smallest $\gamma$ is observed for $\beta_1 = \beta_2 = 0$. The
largest values are obtained for $\beta_1$ and $\beta_2$ large in magnitude and opposite in
sign. The log cross-product ratios show little variation, all falling between 0.70 and 0.95.


Table 1.

Predicted Log Cross-Product Ratios for Item Pairs Under the Rasch Model

Difficulty                       Difficulty of Item 2
of Item 1     -4     -3     -2     -1      0      1      2      3      4
   -4        0.90   0.87   0.83   0.81   0.81   0.84   0.88   0.92   0.95
   -3        0.87   0.84   0.80   0.78   0.78   0.81   0.85   0.89   0.92
   -2        0.83   0.80   0.77   0.75   0.75   0.77   0.81   0.85   0.88
   -1        0.81   0.78   0.75   0.72   0.71   0.73   0.77   0.81   0.84
    0        0.81   0.78   0.75   0.71   0.70   0.71   0.75   0.78   0.81
    1        0.84   0.81   0.77   0.73   0.71   0.72   0.75   0.78   0.81
    2        0.88   0.85   0.81   0.77   0.75   0.75   0.77   0.80   0.83
    3        0.92   0.89   0.85   0.81   0.78   0.78   0.80   0.84   0.87
    4        0.95   0.92   0.88   0.84   0.81   0.81   0.83   0.87   0.90
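
A grid of this kind can be produced with a few lines of code. The sketch below is not the program used for Table 1; it is a minimal illustration that assumes a standard normal ability distribution and a simple uniform-grid quadrature, with all names chosen for this example.

```python
import math

DIFFICULTIES = range(-4, 5)
GRID = [-10.0 + 0.05 * k for k in range(401)]           # theta grid for quadrature
WEIGHTS = [math.exp(-0.5 * th * th) for th in GRID]     # N(0, 1) kernel


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def rasch_gamma(b1, b2):
    """Log cross-product ratio (1) for Rasch items with difficulties b1 and b2."""
    p11 = p10 = p01 = p00 = 0.0
    for th, w in zip(GRID, WEIGHTS):
        p1 = logistic(th - b1)
        p2 = logistic(th - b2)
        p11 += w * p1 * p2
        p10 += w * p1 * (1.0 - p2)
        p01 += w * (1.0 - p1) * p2
        p00 += w * (1.0 - p1) * (1.0 - p2)
    # Normalizing constants and the grid step cancel in the cross-product ratio.
    return math.log(p11 * p00 / (p10 * p01))


table = {(b1, b2): rasch_gamma(b1, b2)
         for b1 in DIFFICULTIES for b2 in DIFFICULTIES}

print("      " + " ".join(f"{b2:>5d}" for b2 in DIFFICULTIES))
for b1 in DIFFICULTIES:
    row = " ".join(f"{table[(b1, b2)]:5.2f}" for b2 in DIFFICULTIES)
    print(f"{b1:>5d} {row}")
```

Every entry should fall strictly between the bounds 2/3 and 1 from Section 2, and the grid is symmetric under exchange of the two items and under a joint sign change of both difficulties.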


4 Conclusion 

Although results are presented for two items, they obviously apply to more general
item response models. Given any one-dimensional model for $J > 2$ binary responses in
which local independence holds, the model applies to any two responses. For example,
consider $J > 2$ items $X_j$ with values 0 or 1. If the $X_j$ are conditionally independent
given a normally distributed random variable $\theta$ with mean 0 and variance
$\sigma^2 > 0$ and if the conditional probability $P_j(\theta)$ that $X_j = 1$ given
$\theta$ is

$$P_j(\theta) = [1 + \exp(-\theta + \beta_j)]^{-1}$$

for some real $\beta_j$, then $X_j$ and $X_k$ are positively correlated for each $j$ and $k$
with $j \ne k$. Thus $X_j$ is positively correlated with the sum $S = \sum_k X_k$, so that the
point-biserial correlation is positive for $X_j$ and $S$. The log cross-product ratio for
each pair $X_j$ and $X_k$ is less than $\sigma^2$, so that the correlation of $X_j$ and $X_k$
is less than $\exp(\sigma^2) - 1$.

The bounds obtained can provide a basis for elementary model checking. Log
cross-product ratios, pairwise item correlations, and point-biserial correlations are readily
estimated without use of model assumptions. If one observes clearly negative estimates of
item-pair correlations, log cross-product ratios, or point-biserial correlations for a data
set, one can conclude even before fitting an item response model that the data are
incompatible with any one-dimensional item response model with monotone increasing item
characteristic curves. Further, observed values of marginal log cross-product ratios that
are clearly outside the bounds suggested in this paper (computation of which requires
fitting an item response model) will indicate the misfit of the item response model employed.
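
An elementary screen of this kind might look as follows. This is a sketch under stated assumptions rather than a procedure from the paper: the counts are hypothetical, the 0.5 added to each cell is a common continuity correction, the standard error is the usual large-sample formula for a log odds ratio, and the two-standard-error margin is an arbitrary choice. The `upper_bound` argument stands for a bound such as $\sigma^2$ taken from a fitted model.

```python
import math


def sample_log_cross_product_ratio(n11, n10, n01, n00, correction=0.5):
    """Estimate and large-sample standard error of a log odds ratio from counts."""
    cells = [n + correction for n in (n11, n10, n01, n00)]
    estimate = math.log(cells[0] * cells[3] / (cells[1] * cells[2]))
    std_error = math.sqrt(sum(1.0 / c for c in cells))
    return estimate, std_error


def screen_pair(counts, upper_bound=None, z=2.0):
    """Flag an item pair whose estimate is clearly negative or above the bound."""
    estimate, std_error = sample_log_cross_product_ratio(*counts)
    flags = []
    if estimate + z * std_error < 0.0:
        flags.append("negative association")
    if upper_bound is not None and estimate - z * std_error > upper_bound:
        flags.append("exceeds upper bound")
    return estimate, std_error, flags


# Hypothetical counts (n11, n10, n01, n00) for two item pairs.
print(screen_pair((120, 80, 90, 210), upper_bound=1.0))
print(screen_pair((60, 140, 130, 70), upper_bound=1.0))
```

The first pair is not flagged under these settings, while the second is flagged because its estimate is negative by more than two standard errors.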